ABSTRACT

A box-tree is a bounding-volume hierarchy that uses axis-aligned
boxes as bounding volumes. The query complexity of a box-tree with
respect to a given type of query is the maximum number of nodes
visited when answering such a query. We describe several new
algorithms for constructing box-trees with small worst-case query
complexity with respect to queries with axis-parallel boxes and with
points. We also prove lower bounds on the worst-case query
complexity for box-trees, which show that our results are optimal
or close to optimal. Finally, we present algorithms to convert
box-trees to R-trees, resulting in R-trees with (almost) optimal
query complexity.

1.  INTRODUCTION 

Motivation and problem statement. Window queries report all objects
of a given set that intersect a query d-rectangle, that is, a
d-dimensional box. Preprocessing a set S of geometric objects in
$\mathbb{R}^d$ for answering such queries is central to many
applications and has been widely studied in several areas, including
computational geometry, computer graphics, spatial databases, GIS,
and robotics [7, 17]. In order to expedite and simplify the data
structure, a window query is often answered in two steps. In the
first step, called the filtering step, each object is replaced by the
smallest box containing the object, and the query procedure reports
the bounding boxes that intersect the query window. (Instead of
boxes, other simple shapes such as spheres, ellipsoids, and cylinders
have also been used.) The second step, called the refinement step,
extracts the actual objects among these bounding boxes that
intersect the query window [4, 19]. A few recent results show that
under certain reasonable assumptions on the input objects, the
number of bounding boxes intersecting a query window is not much
larger than the number of objects intersecting the window, which
makes this approach quite attractive; see the paper by Zhou and
Suri [21] and the references therein. There has been much work on
the filtering step, and we also focus on this step. More precisely,
we wish to preprocess a set S of n d-rectangles in $\mathbb{R}^d$ so
that all rectangles of S intersecting a query d-rectangle can be
reported efficiently. We will refer to this query as the
rectangle-intersection query. A related query is the
rectangle-containment query, in which we want to report all
rectangles in S that contain a query point.

A number of data structures with good provable bounds for answering
rectangle-intersection queries have been proposed. Unfortunately
they are of limited practical use because the amount of storage used
is rather high: $O(n \log n)$ storage, and even $O(n)$ storage with a
large hidden constant, are often unacceptable. Therefore in practice
one usually uses simpler data structures. A commonly used structure
for answering rectangle-intersection queries, rectangle-containment
queries, and in fact many other types of queries is the bounding-box
hierarchy, or box-tree for short, sometimes also called AABB-tree:
this is a tree T in which each leaf is associated with a rectangle of
the input set S, and each interior node v is associated with the
smallest box $B_v$ enclosing all the rectangles stored at the leaves
of the subtree rooted at v. All the rectangles of S intersecting a
query rectangle R are reported by traversing T in a top-down manner.
Suppose the query procedure is visiting a node v. If
$B_v \cap R = \emptyset$, there is nothing to do. If
$B_v \subseteq R$, then it reports all rectangles stored in the
subtree rooted at v. Finally, if $B_v \cap R \neq \emptyset$ but
$B_v \not\subseteq R$, it recursively visits the children of v. We
say that R crosses a node v if $B_v \cap R \neq \emptyset$ and
$B_v \not\subseteq R$. If the fan-out of T is bounded, then the query
time is proportional to the number of nodes of T that R crosses plus
the number of rectangles reported. We define the stabbing number of
T to be the maximum number of its nodes crossed by a rectangle. It
is therefore desirable to construct a bounding-box hierarchy with
small stabbing number.
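The top-down query procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `Node`, `intersects`, `contains` and `query` are hypothetical, boxes are tuples of closed `(lo, hi)` intervals (one per axis), and the function returns the number of crossed nodes so that the stabbing behaviour can be inspected.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[Tuple[float, float], ...]  # one closed (lo, hi) interval per axis


def intersects(a: Box, b: Box) -> bool:
    """True if the closed boxes a and b share at least one point."""
    return all(alo <= bhi and blo <= ahi for (alo, ahi), (blo, bhi) in zip(a, b))


def contains(outer: Box, inner: Box) -> bool:
    """True if inner lies completely inside outer."""
    return all(olo <= ilo and ihi <= ohi for (olo, ohi), (ilo, ihi) in zip(outer, inner))


@dataclass
class Node:
    box: Box                                   # B_v, or the input rectangle at a leaf
    children: List["Node"] = field(default_factory=list)


def report_all(node: Node, out: List[Box]) -> None:
    """Report every input rectangle stored in the subtree of node."""
    if not node.children:
        out.append(node.box)
    for child in node.children:
        report_all(child, out)


def query(node: Node, R: Box, out: List[Box]) -> int:
    """Append all rectangles intersecting R to out; return the number of crossed nodes."""
    if not intersects(node.box, R):
        return 0                               # B_v and R are disjoint: nothing to do
    if contains(R, node.box):
        report_all(node, out)                  # B_v inside R: report the whole subtree
        return 0
    if not node.children:
        out.append(node.box)                   # a crossed leaf still intersects R
        return 1
    return 1 + sum(query(child, R, out) for child in node.children)
```

The returned count plus `len(out)` mirrors the "crossed nodes plus reported rectangles" cost measure used in the text.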

In many applications, especially in database applications, the set S
is too large to fit in main memory and is therefore stored on disk.
In that case, the main goal is to minimize the number of disk
accesses needed to answer a window query, and the performance of an
algorithm is analyzed under the standard external memory model [2].
This model assumes that each disk access transmits a contiguous
block of t units of data in a single input/output operation (or I/O).
The efficiency of a data structure is measured in terms of the amount
of disk space it uses (measured in units of disk blocks), the number
of I/Os required to answer a query, and the number of I/Os needed to
construct the data structure. In the context of bounding-box
hierarchies, several schemes have been proposed that construct a
tree as above but in which the fan-out of each node depends on t.
Some notable examples of external-memory bounding-box hierarchies
are the many variants of R-trees; see the survey paper [11]. We can
still define the crossing nodes and the stabbing number as earlier,
and one can argue that the number of I/Os needed to answer a query
is proportional to the stabbing number plus the output size.

In this paper we study the problem of constructing bounding-box
hierarchies, both in main and external memory, that have low
stabbing number and, consequently, low query complexity.

Previous results. As noted above, several efficient data structures
have been proposed for answering a rectangle-intersection query. For
example, Chazelle [5] showed that a compressed range tree can be
used to answer a d-dimensional rectangle-intersection query in time
$O(\log^{d-1} n + k)$ using $O(n \log^{d-1} n / \log\log n)$ space
(where k is the number of rectangles reported). This data structure
is too complex to be practical even in $\mathbb{R}^2$. As for
bounding-volume hierarchies, we know of only one result on the query
complexity of rectangle-intersection queries (besides the results on
R-trees discussed later): if one maps each d-rectangle to a point in
$\mathbb{R}^{2d}$, constructs a kd-tree on these points, and converts
the kd-tree back to a box-tree, then the query time is known to be
$O(n^{1-1/2d} + k)$ [1, 15]. A number of heuristics based on kd-trees
have also been proposed to answer rectangle-intersection queries
[1, 18]. Several papers [12, 14] describe how to construct
bounding-box hierarchies or other bounding-volume hierarchies (for
example, using k-DOPs as bounding volumes), but they do not obtain
bounds on the worst-case query complexity.¹


¹ Barequet et al. [3] gave an algorithm to construct a
bounding-box hierarchy in $\mathbb{R}^2$, and they claimed that if the


Some of the most widely used external-memory bounding-box
hierarchies are the R-tree and its variants. An R-tree, originally
introduced by Guttman [13], is a B-tree, each of whose leaves is
associated with an input rectangle. All leaves of an R-tree are at
the same level, the degree of all internal nodes except the root is
between t and 2t, for a given parameter t, and the degree of the
root varies between 2 and 2t. We will refer to t as the minimum
degree of the tree. To minimize the query complexity, several
methods have been proposed [9, 10, 11, 16] for ordering the input
rectangles along the leaves (varying from simple heuristics to
space-filling curves), but none of them guarantees the worst-case
performance. In the worst case, a linear number of bounding boxes
might intersect a query rectangle even if it intersects only $O(1)$
input rectangles. The only analytical results are by Theodoridis and
Sellis [20], who present a model that predicts the average
performance of R-trees for range queries, and by Faloutsos et
al. [10], but the latter prove bounds on the query time only in the
1-dimensional case when the input intervals are uniformly
distributed and have at most two different lengths. Recently, de
Berg et al. [6] described an algorithm for constructing an R-tree on
rectangles in $\mathbb{R}^2$ so that all k rectangles containing a
query point can be reported in
$O((\sigma + \log\rho) \log n / \log t)$ I/Os. Here $\rho$ is the
ratio of the maximum and the minimum x-lengths of the input
rectangles, and $\sigma$ is the point-stabbing number of S, that is,
$\sigma$ is the maximum number of input rectangles containing any
point in the plane. For a rectangle-intersection query, the number
of I/Os is $O((\sigma + \log\rho + w + k) \log n / \log t)$, where w
is the ratio of the x-length of the query rectangle to the smallest
x-length of an input rectangle.

Our results. In this paper we first describe several algorithms for
constructing box-trees, and we prove lower bounds on the worst-case
query complexity of box-trees. The lower bounds actually hold for
all bounding-volume hierarchies that use convex shapes as bounding
volumes.

Our first algorithm, like the approach mentioned earlier, is based
on a kd-tree in $\mathbb{R}^{2d}$. By changing the structure slightly
and doing a more careful analysis, we are able to obtain
$O(n^{1-1/d} + k)$ query complexity for rectangle-intersection
queries. We also prove a lower bound showing that this bound is
optimal.

For disjoint input in the plane, we show how to construct a box-tree
that still has almost optimal query time for rectangle-intersection
queries, but much better query times for point queries. In fact, it
is already better for point queries when the point-stabbing number
$\sigma$ of the input is $o(n / \log^4 n)$: the time for
rectangle-intersection queries is
$O(\sqrt{n} \log n + \sqrt{\sigma} \log^2 n + k)$, and the time for
point queries is $O(\sqrt{\sigma} \log^2 n + k)$. We also develop a
box-tree with $O((\alpha + \sqrt{\sigma}) \log^2 n + k)$ query time
for use with query rectangles with aspect ratio $\alpha$. One would
hope that similar improvements are possible in higher dimensions.
One of our lower-bound results shows that this is not possible: in
dimensions $d \ge 3$, the $\Omega(n^{1-1/d} + k)$ lower bound on the
query complexity holds even for hypercubes as query ranges, and any
bounding-box hierarchy that achieves this query time cannot have a
better worst-case query time for point queries, even when the input
consists of disjoint 'almost-unit hypercubes'.

rectangles  in  S  are  pairwise  disjoint,  then  the  resulting  hi¬ 
erarchy  has  O(logn)  stabbing  number.  But  the  argument 
presented  in  the  paper  has  a  technical  problem. 
Finally, we give general methods to convert box-trees with small
query complexity into R-trees with small query complexity. When we
apply these results to our box-trees, we improve the result of de
Berg et al. [6]: our query complexity does not depend on the
parameter w (which makes their query complexity linear in the worst
case), and it is linear in $\sqrt{\sigma}$ instead of in $\sigma$.
We also introduce the concept of semi-R-trees; these are similar to
ordinary R-trees (the degree of each internal node, except for the
root, is between t and 2t for some given parameter t), except that
the leaves do not have to be at the same level. We give a general
algorithm to convert a box-tree with small query complexity into a
semi-R-tree with small query complexity; the bound obtained here is
better than that for R-trees. This leads to semi-R-trees with
(almost) optimal query complexity.

All box-tree construction algorithms in this paper run in
$O(n \log n)$ time, and all box-tree-to-(semi-)R-tree conversion
algorithms run in $O(n)$ time.

2.  LOWER  BOUNDS 

In this section we give lower bounds on the query complexity of
semi-R-trees of minimum degree t in various settings. Since
semi-R-trees are more general than R-trees, the same bounds hold for
R-trees. By choosing t = 2, we obtain lower bounds for box-trees.

We start with a simple generalization of the 2-dimensional lower
bound given by de Berg et al. [6].

Theorem 2.1. For any n and $d \ge 2$, there is a set S of n
disjoint unit hypercubes in $\mathbb{R}^d$ with the following
property: for any semi-R-tree T of minimum degree t there is a query
box not intersecting any box from S such that a query with that box
visits $\Omega((n/t)^{1-1/d})$ nodes in T.

Proof. Consider a set of n unit hypercubes arranged in an
$n^{1/d} \times \cdots \times n^{1/d}$ grid, and the following set of
query ranges: for each axis, we choose $n^{1/d} - 1$ thin boxes
orthogonal to it and separating the 'slices' of the grid from each
other. Now any bounding box on t hypercubes intersects at least
$d(t^{1/d} - 1)$ of the query ranges. Hence, the total number of
incidences between the ranges and the bounding boxes is
$\Omega((n/t) \cdot t^{1/d})$. As there are $O(n^{1/d})$ ranges,
there must be one that intersects $\Omega((n/t)^{1-1/d})$ bounding
boxes. □
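The counting in this proof can be spelled out as follows. This is a sketch under our reading of the argument: we assume that T has $\Omega(n/t)$ bounding boxes that each enclose at least t hypercubes, and we treat d as a constant.

```latex
\begin{align*}
\#\text{incidences}
  &\;\ge\; \Omega\!\left(\frac{n}{t}\right) \cdot d\left(t^{1/d} - 1\right)
   \;=\; \Omega\!\left(\frac{n}{t}\, t^{1/d}\right), \\
\#\text{ranges}
  &\;=\; d\left(n^{1/d} - 1\right) \;=\; O\!\left(n^{1/d}\right), \\
\frac{\#\text{incidences}}{\#\text{ranges}}
  &\;=\; \Omega\!\left(\frac{n}{t} \cdot \frac{t^{1/d}}{n^{1/d}}\right)
   \;=\; \Omega\!\left(\left(\frac{n}{t}\right)^{1-1/d}\right).
\end{align*}
```

By the pigeonhole principle, at least one range meets the average number of bounding boxes, which gives the stated bound.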

Next we describe a construction that proves lower bounds on
rectangle-containment queries and that is also useful for a number
of other cases. For any $\varepsilon > 0$, we call a d-rectangle an
$\varepsilon$-hypercube if the length of each edge is between 1 and
$1 + \varepsilon$. We fix a parameter $p > 1$ and construct a set
$S = \{b(0), \ldots, b(n-1)\}$ of n $\varepsilon$-hypercubes in
$\mathbb{R}^d$. We also construct two sets of query points $Q_1$ and
$Q_2$, called primary and secondary point sets, that lie in the
common exterior of the rectangles in S and have the following
property: for any semi-R-tree T on S with minimum degree t, either a
point of $Q_1$ lies in $\Omega(p)$ bounding boxes of T or a point of
$Q_2$ lies in $\Omega((n/t)/p^{1/(d-1)})$ bounding boxes of T. From
this we derive the desired lower bounds. We first describe the set S
and then construct the point sets.

Let $n_1, \ldots, n_{2d}$ be the outward normals of a d-rectangle.
We can pair these normals into d pairs $(n_{11}, n_{12})$,
$(n_{21}, n_{22})$, ..., $(n_{d1}, n_{d2})$ so that no pair contains
opposite normals, that is, $n_{i1} \neq -n_{i2}$ for
$1 \le i \le d$. Let $h_i$ be the 2-plane spanned by the vectors
$n_{i1}$ and $n_{i2}$ and containing the origin. Let b be a
d-rectangle containing the origin. Since $n_{i1} \neq -n_{i2}$, the
facets $f_{i1}, f_{i2}$ of b normal to $n_{i1}$ and $n_{i2}$,
respectively, share a (d - 2)-face $f_i$, which is orthogonal to the
2-plane $h_i$. The intersection of $f_i$ and $h_i$ is a point $c_i$.
Conversely, by specifying a point $c_i$ on each $h_i$,
$1 \le i \le d$, we can represent a unique d-rectangle in which
$c_i$ lies on the facets normal to $n_{i1}$ and $n_{i2}$. We will
therefore define each rectangle $b(j) \in S$ by a d-tuple
$(c_1(j), \ldots, c_d(j))$, where the facets of b(j) whose outward
normals are $n_{i1}$ and $n_{i2}$ pass through $c_i(j)$. We next
describe how to choose the points $c_i(j)$ for $1 \le i \le d$ and
$0 \le j < n$.

On each 2-plane $h_i$, we choose a line $\ell_i$ of slope -1; the
exact equation of $\ell_i$ will be specified below. We will refer to
$h_1$ as the primary plane, and to $h_i$, for $i > 1$, as a
secondary plane. Set $\rho = p^{1/(d-1)}$. We place n points
$p_1(0), \ldots, p_1(n-1)$ on $\ell_1$ (sorted along $\ell_1$ by
ascending $n_{11}$-coordinate and, consequently, by descending
$n_{12}$-coordinate) and set $c_1(j) = p_1(j)$ for every
$0 \le j < n$. For each $i > 1$, we place $\rho$ points
$p_i(0), \ldots, p_i(\rho - 1)$ on $\ell_i$ and assign $c_i(j)$ to
these points as follows. Let
$\alpha(j) = (\alpha_0(j), \ldots, \alpha_{d-2}(j))$ be the
representation of $j \bmod p$ in radix $\rho$, that is,
$\sum_k \alpha_k(j) \rho^k = j \bmod p$. For each $i > 1$, we set
$c_i(j) = p_i(\alpha_{d-i}(j))$. Note that $n/\rho$ points have the
same value of $c_i(j)$. We choose $\ell_i$ and the points on $\ell_i$
so that each b(j) is an $\varepsilon$-hypercube, e.g. by putting all
points $p_i(j)$ at a distance of at least 1/2 and at most
$(1 + \varepsilon)/2$ from the origin, both in their projection on
the $n_{i1}$-axis and on the $n_{i2}$-axis.
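The radix representation that assigns each box its corners on the secondary planes can be illustrated with a small sketch. The function name `radix_digits` is hypothetical, and we assume that the radix is an integer whose (d - 1)-th power equals p, so that j mod p has exactly d - 1 digits.

```python
from typing import List


def radix_digits(j: int, p: int, rho: int, d: int) -> List[int]:
    """Digits (alpha_0, ..., alpha_{d-2}) of j mod p written in radix rho.

    Assumes rho ** (d - 1) == p, so d - 1 digits always suffice.
    """
    r = j % p
    digits = []
    for _ in range(d - 1):
        digits.append(r % rho)   # least significant digit first
        r //= rho
    return digits


# Example with d = 3, rho = 3, p = 9: indices 7 and 16 share all digits,
# because they are congruent modulo p.
print(radix_digits(7, 9, 3, 3), radix_digits(16, 9, 3, 3))
```

Two distinct indices that differ by less than p are never congruent modulo p, so their digit vectors differ in at least one position; this is the property that forces such a pair of boxes apart on at least one secondary plane.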

Finally, we choose a set $Q_1$ of $n - 1$ points on the primary
plane $h_1$ and a set $Q_2$ of $(d-1)(\rho-1)$ points on the
secondary planes, as follows. Suppose $h_1$ is the $x_1x_2$-plane.
For each $1 \le j \le n - 1$, we choose the point
$q(j) = (x_1(p_1(j-1)), x_2(p_1(j)))$ and add it to $Q_1$. In other
words, if we regard the points on $\ell_1$ as the convex corners of a
staircase, $Q_1$ is the set of concave corners of the staircase. To
construct $Q_2$, we repeat the same step for each of the secondary
planes, thus obtaining $\rho - 1$ points on each of them. These
points will be on the boundary of some of the input boxes, but we
can shift them a little to make them disjoint from all input boxes.

Lemma 2.2. Let T be any semi-R-tree of minimum degree t on the set
S constructed above. Then either there is a primary query point
contained in $\Omega(p)$ bounding boxes stored in T, or one of the
secondary query points is contained in $\Omega(n/(t p^{1/(d-1)}))$
bounding boxes stored in T.

Proof. We first prove the lemma for box-trees, which are binary
trees. Suppose that all primary query points are contained in less
than p/2 bounding boxes stored in the interior nodes in T. Then the
number of incidences between these points and interior nodes'
bounding boxes is at most $(n-1)p/2$. Since there are $n - 1$
interior nodes in T, they store at least $(n-1)/2$ bounding boxes
that contain less than p primary query points. Observe that a
bounding box for input boxes $b(j), b(j') \in S$ contains $|j - j'|$
primary query points, because there are that many concave corners in
the staircase between corners $c_1(j)$ and $c_1(j')$. We conclude
that there are at least $(n-1)/2$ bounding boxes that store boxes
$b(j), b(j')$ with $j \neq j'$ (and perhaps some more boxes) with
$|j - j'| < p$. But if $0 < |j - j'| < p$ then
$j \not\equiv j' \pmod{p}$, so $\alpha(j) \neq \alpha(j')$. This
implies that there is at least one i with $2 \le i \le d$ such that
$c_i(j) \neq c_i(j')$. Hence, the bounding box storing
$b(j), b(j')$ will contain one of the secondary query points. So in
total we have at least $(n-1)/2$ incidences between secondary query
points and bounding boxes, so one of the
$(d-1)(\rho-1) = O(p^{1/(d-1)})$ secondary query points is contained
in $\Omega(n/p^{1/(d-1)})$ bounding boxes.

The generalization to semi-R-trees follows easily from the
observation that a semi-R-tree of minimum degree t has
$\Theta(n/t)$ nodes. If each primary query point is contained in
less than p/2 bounding boxes, we then get $\Omega(n/t)$ nodes whose
bounding box contains less than p primary query points. From that
point on, we can basically follow the argument above. □

We can use this lemma to prove lower bounds for several settings.
By substituting $p = (n/t)^{1-1/d}$, we prove the following lower
bound for point queries.

Theorem 2.3. For any n, $d \ge 2$, and $\varepsilon > 0$, there is
a set S of n $\varepsilon$-hypercubes in $\mathbb{R}^d$ with the
following property: for any semi-R-tree T of minimum degree t there
is a point not contained in any box from S such that a query with
that point visits $\Omega((n/t)^{1-1/d})$ nodes in T.
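The substitution that turns Lemma 2.2 into this theorem can be checked directly. The following worked step uses only the two bounds of the lemma and the choice $p = (n/t)^{1-1/d}$:

```latex
\begin{align*}
p &= \left(\frac{n}{t}\right)^{1-1/d} = \left(\frac{n}{t}\right)^{(d-1)/d}
  \quad\Longrightarrow\quad
  p^{1/(d-1)} = \left(\frac{n}{t}\right)^{1/d}, \\
\frac{n}{t\,p^{1/(d-1)}}
  &= \frac{n}{t} \cdot \left(\frac{n}{t}\right)^{-1/d}
   = \left(\frac{n}{t}\right)^{1-1/d}.
\end{align*}
```

With this choice both alternatives of the lemma give $\Omega((n/t)^{1-1/d})$, so whichever case applies, some query point lies in that many bounding boxes.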

Next, we modify the above construction so that the same bound can
be achieved in $d \ge 3$ even if the input consists of a set of n
disjoint $\varepsilon$-hypercubes and the queries are hypercubes.

Theorem 2.4. For any n, $d \ge 3$, and $\varepsilon > 0$, there is
a set S of n disjoint $\varepsilon$-hypercubes in $\mathbb{R}^d$
with the following property: for any semi-R-tree T of minimum
degree t there is a hypercube not intersecting any box from S such
that a query with that hypercube visits $\Omega((n/t)^{1-1/d})$
nodes in T.

Proof. We apply a variant of the construction above with
$p = (n/t)^{1-1/d}$ to obtain a set of (d - 1)-dimensional boxes in
the hyperplane $x_1 = 0$. The variation is that we treat all planes
on which we put the corners as secondary planes. We use the
remaining dimension to make the boxes into d-dimensional
$\varepsilon$-hypercubes, and we translate each box in the
$x_1$-direction such that they become disjoint and intersect the
$x_1$-axis in the order $b(1), b(2), \ldots, b(n)$. In between every
pair $b(j), b(j+1)$ we put a query point. These $n - 1$ query points
play the role of the primary query points. The secondary query
points are replaced by query ranges which are hypercubes. We can do
that in such a way that the intersection of such a range with a
secondary plane is a square that misses S and that has one corner
coinciding with the secondary query points we had previously. It is
easy to see that the bound in Lemma 2.2 still holds. □

Finally, we observe that the proof of the preceding theorem
actually shows that in higher dimensions any semi-R-tree with small
(say, polylogarithmic) query complexity for points must have large
(near-linear) query complexity for ranges. More precisely, it shows
the following result.

Theorem 2.5. For any n, $d \ge 3$ and $\varepsilon > 0$, there is a
set S of n disjoint $\varepsilon$-hypercubes in $\mathbb{R}^d$ with
the following property: for any semi-R-tree T of minimum degree t,
if the number of nodes visited by any point query is p, then there
is a hypercube not intersecting any box from S such that a query
with that hypercube visits $\Omega(n/(t p^{1/(d-1)}))$ nodes in T.


3.  FROM  KD-TREES  TO  BOX-TREES 

In this section we describe and analyze several methods to
construct box-trees using kd-trees. For convenience we will allow
our box-trees to have nodes of degree up to 2d + 3; it is easy to
convert these trees to binary trees without affecting the
asymptotic bounds on the query complexity. Query ranges (other than
points) will be assumed to be open, while input boxes, bounding
boxes and cells in space decompositions are closed.

3.1  The  configuration-space  approach 

The basic method. Let S be a set of n arbitrary, possibly
overlapping, d-rectangles in $\mathbb{R}^d$, which we call the
workspace. As noted in the introduction, we can represent a
d-rectangle $b = \prod_{i=1}^{d} [x_i^-(b), x_i^+(b)]$ as a point
$(x_1^-(b), x_2^-(b), \ldots, x_d^-(b), x_1^+(b), x_2^+(b), \ldots, x_d^+(b))$
in $\mathbb{R}^{2d}$, which we call the configuration space. We
build a 2d-dimensional kd-tree on these points.

A kd-tree is a binary space decomposition tree, which is used to
index points. Every node in a 2d-dimensional kd-tree is associated
with a cell, which is a 2d-rectangle, and an axis-parallel splitting
hyperplane. The splitting plane divides the cell into two
rectangular subcells, one for each child of the node.

The root cell is chosen large enough to contain all input points.
The tree is then built recursively by determining splitting planes
for all cells. The orientations of the splitting planes depend on
the level in the tree, in such a way that all possible orientations
(2d in this case) take turns in a round-robin fashion on any path
down into the tree. The location of each splitting plane is chosen
such that the numbers of input points in the resulting subcells
differ by at most one. When a cell contains only one input point, we
make it a leaf of the tree and do not split it further.

To transform the kd-tree in configuration space into a box-tree in
workspace, proceed as follows. Replace the representative point in
each leaf by the corresponding input box. Then, going bottom-up,
store in each internal node the bounding box of its children. We
call the resulting box-tree a configuration-space box-tree, or
cs-box-tree for short.
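The construction of a cs-box-tree can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `to_config_point`, `bounding_box`, `Node`, `build_cs_box_tree` and `leaves` are hypothetical, the median split is obtained by sorting at every level (which costs more than the $O(n \log n)$ bound claimed in the paper), and the kd-tree and the box-tree are built in one pass rather than in two separate steps.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

Box = Tuple[Tuple[float, float], ...]  # one closed (lo, hi) interval per axis


def to_config_point(b: Box) -> Tuple[float, ...]:
    """Map a d-rectangle to (x1-, ..., xd-, x1+, ..., xd+) in configuration space."""
    return tuple(lo for lo, _ in b) + tuple(hi for _, hi in b)


def bounding_box(boxes: Sequence[Box]) -> Box:
    """Smallest axis-aligned box enclosing all given boxes."""
    d = len(boxes[0])
    return tuple((min(b[i][0] for b in boxes), max(b[i][1] for b in boxes))
                 for i in range(d))


@dataclass
class Node:
    box: Box
    children: List["Node"] = field(default_factory=list)


def build_cs_box_tree(boxes: Sequence[Box], depth: int = 0) -> Node:
    """Split by the median configuration-space coordinate, cycling over 2d axes."""
    if len(boxes) == 1:
        return Node(box=boxes[0])                   # leaf stores the input box
    axis = depth % (2 * len(boxes[0]))              # round-robin over 2d orientations
    ordered = sorted(boxes, key=lambda b: to_config_point(b)[axis])
    mid = (len(ordered) + 1) // 2                   # subset sizes differ by at most one
    left = build_cs_box_tree(ordered[:mid], depth + 1)
    right = build_cs_box_tree(ordered[mid:], depth + 1)
    return Node(box=bounding_box([left.box, right.box]), children=[left, right])


def leaves(node: Node) -> List[Box]:
    """All input boxes stored below node."""
    if not node.children:
        return [node.box]
    return [b for child in node.children for b in leaves(child)]
```

Splitting the sorted list at its middle index is one simple way to realize a splitting hyperplane that balances the two sides; a linear-time median selection would be the natural replacement if the construction time matters.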

In the introduction we pointed out that it can be used to do
rectangle-intersection queries in $O(n^{1-1/2d} + k)$ time; in this
paper we will show how to improve the upper bound to
$O(n^{1-1/d} + k)$.

For the analysis of the range query complexity of the cs-box-tree,
we need the following fact about kd-trees, given here without
proof.

Lemma 3.1. The number of cells at depth i in a d-dimensional
kd-tree that intersect an axis-parallel f-flat ($0 \le f \le d$) is
$O(2^{if/d})$.

A kd-tree and, hence, our box-tree has the following property: the
numbers of objects stored in the two subtrees of any given node
differ by at most one. We call such trees perfectly balanced. The
perfect balance in our box-tree will be advantageous when we convert
it to an R-tree. We can now analyze the range query complexity of a
cs-box-tree.

Lemma 3.2. Let S be a set of n possibly intersecting boxes in
$\mathbb{R}^d$. There is a perfectly balanced box-tree for S such
that the number of nodes at level i that are visited by a range
query with an axis-aligned box is $O(2^{i(1-1/d)} + k)$, where k is
the number of boxes in S intersecting the query range. The box-tree
can be built in $O(n \log n)$ time.

Proof. Let $Q = \prod_{i=1}^{d} (x_i^-(Q), x_i^+(Q))$ be a query
range. We can restrict our attention to the interior nodes visited,
since the number of visited leaves is at most one more. We
distinguish two types of visited interior nodes v. The first type is
where at least one of the input boxes stored in the subtree of v
intersects Q. Obviously there are only $O(k)$ such nodes at a given
level i. The second type is where all input boxes in the subtree of
v are disjoint from Q. The interior of any input box disjoint from Q
must be separated from Q by a hyperplane through a facet of Q. Not
all input boxes are separated from Q by the same hyperplane,
otherwise the bounding box of v would not intersect Q and v would
not be visited. Hence, there are at least two such hyperplanes
separating Q from an input box in the subtree of v.

Assume w.l.o.g. that $x_i = x_i^-(Q)$ is one of these separating
hyperplanes, and let b be the input box it separates from Q. Then
we must have $x_i^+(b) \le x_i^-(Q)$. But there must also be a box
$b'$ with $x_i^+(b') > x_i^-(Q)$, otherwise the bounding box of v
would not intersect Q. We conclude that the points representing b
and $b'$ in the configuration space lie on opposite sides of the
hyperplane $x_i^+ = x_i^-(Q)$. Consequently, the hyperplane
$x_i^+ = x_i^-(Q)$ intersects the cell in configuration space of the
node in the kd-tree corresponding to v.

We can apply the same argument to the second hyperplane separating
Q from an input box (the hyperplane $x_j = x_j^+(Q)$, for example),
to show that there is a hyperplane in configuration space with
representative points on it or on opposite sides of it
($x_j^- = x_j^+(Q)$ in the example).

We can conclude the following. Suppose Q visits a node v of the
second type. Then in configuration space there is a pair of
hyperplanes, both of the form $x_i^+ = x_i^-(Q)$ or
$x_i^- = x_i^+(Q)$ and both intersecting the cell in configuration
space of the kd-tree node corresponding to v. But then the cell is
also intersected by the (2d - 2)-flat that is the intersection of
these two hyperplanes. By Lemma 3.1 there are only
$O(2^{i(2d-2)/2d}) = O(2^{i(1-1/d)})$ such nodes at level i.

For the building time, see Section 3.4. □

This  leads  directly  to  the  following  theorem. 

Theorem 3.3. Let S be a set of n possibly intersecting boxes in
$\mathbb{R}^d$. There is a perfectly balanced box-tree for S such
that the number of nodes visited by a range query with an
axis-aligned box is $O(n^{1-1/d} + k \log n)$, where k is the number
of boxes in S intersecting the query range. The box-tree can be
built in $O(n \log n)$ time.

Proof. From Lemma 3.2 we get a bound for the stabbing number on
each level in the tree. Since a kd-tree has height
$\lceil \log n \rceil$, so has a cs-box-tree, and summation over all
levels yields a total query complexity of
$\sum_{i=0}^{\lceil \log n \rceil} O(2^{i(1-1/d)} + k) = O(n^{1-1/d} + k \log n)$. □
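The summation in this proof is a geometric series plus a term that is repeated once per level. A short worked version, assuming logarithms to base 2 and constant d:

```latex
\begin{align*}
\sum_{i=0}^{\lceil \log n \rceil} O\!\left(2^{i(1-1/d)} + k\right)
  &= O\!\left(\sum_{i=0}^{\lceil \log n \rceil} \left(2^{1-1/d}\right)^{i}\right)
     + O\!\left(k \log n\right) \\
  &= O\!\left(2^{(1-1/d)\lceil \log n \rceil}\right) + O\!\left(k \log n\right) \\
  &= O\!\left(n^{1-1/d} + k \log n\right).
\end{align*}
```

The second step uses that the ratio $2^{1-1/d}$ is a constant larger than 1, so the series is dominated by its last term.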

Improving the query time. We now show how to reduce the
$O(k \log n)$ term in the query complexity to $O(k)$. The idea is
the same as in a priority search tree [7]: input elements (boxes in
our case) that have a high chance of being reported are pushed to
high levels in the tree. In our case, the boxes that extend farthest
in one of the $x_i$-directions are stored high in the tree. More
precisely, the construction of the tree T for a set S of boxes in
$\mathbb{R}^d$ is as follows.

If $|S| = 1$, then T consists of a single leaf node storing the
input box in S. Otherwise we make a node v storing the bounding box
$B_v$ of all boxes in S, and proceed as follows.

For each of the 2d inner normals of the facets of $B_v$, take
the box from S that extends farthest in the direction of that
normal. This results in a set S* of at most 2d boxes. Each
box in S* is put in a so-called priority leaf, which is an
immediate child of v.

If  the  set  S  \  S*  of  remaining  boxes  contains  less  than 
two  boxes,  then  this  box  (if  it  exists)  is  put  as  a  leaf  child 
of  v.  If  two  or  more  boxes  remain,  we  split  the  set  of  boxes 
into  two  (almost)  equal-sized  subsets  with  an  axis-parallel 
hyperplane  in  configuration  space.  Like  in  a  normal  kd-tree, 
the  orientation  of  the  splitting  plane  depends  on  the  level  in 
the  tree,  so  that  all  2d  orientations  take  turns  in  a  round- 
robin  fashion  on  any  path  from  the  root  down  into  the  tree. 

The  subset  of  boxes  whose  representative  points  lie  to 
one  side  of  the  cutting  hyperplane  are  stored  recursively  in 
one  subtree  of  v.  The  subset  of  boxes  whose  representative 
points  lie  to  the  other  side  of  the  cutting  hyperplane  are 
stored  recursively  in  another  subtree  of  v. 
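
The construction above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the names `Node`, `build` and `range_query` are hypothetical, boxes are represented as pairs of corner tuples, ties are broken arbitrarily, and the median is found by sorting at every node (so this does not achieve the building time of Section 3.4).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[Tuple[float, ...], Tuple[float, ...]]  # (lower corner, upper corner)

@dataclass
class Node:
    bbox: Box
    box: Optional[Box] = None                 # input box, set for leaves only
    children: List["Node"] = field(default_factory=list)

def bounding_box(boxes: List[Box]) -> Box:
    d = len(boxes[0][0])
    lo = tuple(min(b[0][i] for b in boxes) for i in range(d))
    hi = tuple(max(b[1][i] for b in boxes) for i in range(d))
    return lo, hi

def config_point(box: Box) -> Tuple[float, ...]:
    # Representative point (x_1^-, x_1^+, ..., x_d^-, x_d^+) in 2d dimensions.
    return tuple(c for pair in zip(box[0], box[1]) for c in pair)

def build(boxes: List[Box], level: int = 0) -> Node:
    if len(boxes) == 1:
        return Node(bbox=boxes[0], box=boxes[0])
    d = len(boxes[0][0])
    node = Node(bbox=bounding_box(boxes))
    ids = range(len(boxes))
    chosen = set()                            # indices of the boxes in S*
    for i in range(d):
        chosen.add(max(ids, key=lambda j: boxes[j][1][i]))  # farthest in +x_i
        chosen.add(min(ids, key=lambda j: boxes[j][0][i]))  # farthest in -x_i
    node.children = [Node(bbox=boxes[j], box=boxes[j]) for j in sorted(chosen)]
    rest = [boxes[j] for j in ids if j not in chosen]
    if len(rest) == 1:
        node.children.append(Node(bbox=rest[0], box=rest[0]))
    elif len(rest) >= 2:
        axis = level % (2 * d)                # round-robin over 2d orientations
        rest.sort(key=lambda b: config_point(b)[axis])
        mid = len(rest) // 2
        node.children.append(build(rest[:mid], level + 1))
        node.children.append(build(rest[mid:], level + 1))
    return node

def intersects(a: Box, b: Box) -> bool:
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(len(a[0])))

def range_query(root: Node, q: Box):
    # Returns the reported boxes and the number of nodes whose box meets q.
    reported, visited, stack = [], 0, [root]
    while stack:
        node = stack.pop()
        if not intersects(node.bbox, q):
            continue
        visited += 1
        if node.box is not None:
            reported.append(node.box)
        else:
            stack.extend(node.children)
    return reported, visited
```

Here a node counts as visited when its bounding box intersects the query, which is the quantity the analysis bounds.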

Next we analyze the query complexity of the tree resulting
from this construction, which we call a cs-priority-box-tree.
In our analysis we bound the number of visited nodes of a
given weight, where the weight of a node is defined as the
number of input boxes stored in its subtree. This will be
useful when we convert this box-tree into a semi-R-tree.

Lemma 3.4. The number of nodes of weight at least w
visited by a query with a query box Q is $O((n/w)^{1-1/d} + k)$.

Proof. Let $Q = \prod_{i=1}^{d} [x_i^-(Q), x_i^+(Q)]$. We can restrict
our attention to the visited nodes of weight at least 2d, as
the total number of visited nodes is at most a constant times
larger than this number. Let v be such a visited node of
weight at least 2d. There are two cases.

The  first  case  is  where  one  of  the  priority  leaves  directly 
below  v  stores  a  box  intersecting  Q.  Clearly  there  are  at 
most  k  such  nodes. 

The second case is when all priority leaves directly below v
store boxes disjoint from Q. Thus each such box's interior is
separated from Q by a hyperplane through a facet of Q. We
claim that not all boxes can be separated by the same
hyperplane. Suppose for a contradiction that there is a facet f
whose containing hyperplane separates all boxes of the
priority leaves from Q. Then in particular it would separate
the box that extends farthest in the direction of the inner
normal of the facet f, contradicting that Q intersects the
bounding box stored at v. So we have two distinct
hyperplanes through facets of Q separating a box in the subtree
of v from Q.

The box-tree that we have constructed basically
corresponds to a kd-tree in configuration space, as before.
Because of the priority leaves, the tree in configuration space is,
strictly speaking, not a kd-tree, but it is easy to see that
Lemma 3.1 still holds. Moreover, there is still a one-to-one
correspondence between nodes of the box-tree and nodes of
the kd-tree in configuration space. Hence, we can use the
fact that there are two distinct hyperplanes through facets
of Q separating a box in the subtree of v from Q in the same
way as in the proof of Lemma 3.2: it implies that there is
a (2d - 2)-flat in configuration space (defined by a pair of
facets of Q) intersecting the cell in the kd-tree corresponding
to v. It follows that the total number of nodes v to which
the second case applies at a given level i is $O(2^{i(1-1/d)})$.

To finish the proof, observe that nodes at the lowermost
$\lfloor \log(w/(2d)) \rfloor$ levels have weight less than w. Adding the
bounds for the second case on the remaining levels, we get

$\sum_{i=0}^{\lceil \log n \rceil - \lfloor \log(w/(2d)) \rfloor} O(2^{i(1-1/d)}) = O((n/w)^{1-1/d})$.

For  the  building  time,  see  section  3.4.  □ 
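
The final summation in the proof of Lemma 3.4 is a geometric series dominated by its last term. The following worked derivation spells this out; the shorthands L (the upper summation limit) and c = 1 - 1/d are introduced here for illustration only, and the derivation assumes that d is a constant with d >= 2 (so that c > 0) and that w >= 2d.

```latex
\begin{align*}
L &= \lceil \log n \rceil - \lfloor \log (w/(2d)) \rfloor
   \le (\log n + 1) - (\log (w/(2d)) - 1) = \log (8dn/w), \\
\sum_{i=0}^{L} 2^{ic}
  &= \frac{2^{c(L+1)} - 1}{2^{c} - 1}
   \le \frac{2^{c}}{2^{c} - 1} \, 2^{cL}
   \le \frac{2^{c}}{2^{c} - 1} \left( \frac{8dn}{w} \right)^{c}
   = O\bigl( (n/w)^{1-1/d} \bigr).
\end{align*}
```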

The  following  theorem  follows  directly. 

Theorem 3.5. Let S be a set of n possibly intersecting
boxes in $\mathbb{R}^d$. There is a box-tree for S such that the number
of nodes that are visited by a range query with an
axis-aligned box is $O(n^{1-1/d} + k)$, where k is the number of boxes
in S intersecting the query range. The box-tree can be built
in $O(n \log n)$ time.

3.2  The  kd-interval-tree  approach 

The  cs-box-tree  of  the  previous  section  has  optimal  query 
complexity  for  point  queries  (and  range  queries)  if  the  input 
consists  of  arbitrary,  intersecting  boxes.  Unfortunately,  if 
the  input  boxes  are  disjoint  then  the  query  complexity  for 
point queries does not improve. In this section we develop
a different box-tree, the kd-interval tree, whose query
complexity is much better if $\sigma$, the point-stabbing number of
the input set S, is small. The query complexity for range
queries  increases  only  slightly.  This  approach  only  works  in 
the  plane;  Theorem  2.5  states  that  a  similar  result  in  more 
than  two  dimensions  cannot  be  obtained. 

The  basic  idea  behind  kd-interval  trees  is  again  to  use  a 
kd-tree,  but  this  time  in  the  workspace  (which  is  now  the 
plane).  Since  the  objects  in  the  workspace  are  rectangles, 
not  points,  many  of  them  may  intersect  the  cutting  line. 
These  boxes  are  taken  out  and  handled  separately,  like  in 
an  interval  tree.  To  make  kd-interval  trees  more  efficient, 
we  introduce  priority  leaves,  like  in  the  previous  section. 

The 1-dimensional case. First we describe how a set S
of boxes all intersecting a given line $\ell$ is handled. With
a slight abuse of terminology, we call a tree for this case a
1-dimensional kd-interval tree.

If |S| = 1, then T consists of a single leaf node storing the
input rectangle in S. Otherwise we make a node v storing
the bounding box $B_v$ of all rectangles in S, and proceed as
follows.

For each of the 4 inner normals of the edges of $B_v$, take
the rectangle from S that extends farthest in the direction of
that normal. This results in a set S* of at most 4 rectangles.
Each rectangle in S* is put in a priority leaf.

Consider the set of intersections of the edges of the
remaining rectangles with $\ell$. Let p be the median of these
intersection points. The rectangles in S \ S* containing p are
stored in a subtree of v that is a 2-dimensional
cs-priority-box-tree as described in the previous section. The rectangles
in S \ S* completely to one side of p are stored recursively
as a 1-dimensional kd-interval tree in a second subtree of v.
The rectangles in S \ S* completely to the other side of p
are stored recursively in another subtree of v.
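
The three-way partition around the median intersection point can be illustrated as follows. This is a minimal sketch under assumptions: each rectangle is reduced to its intersection with the defining line, represented as a closed interval `(lo, hi)`, the lower median of the endpoints is used as p, and the function name `split_at_median_endpoint` is hypothetical.

```python
from statistics import median_low

def split_at_median_endpoint(intervals):
    """Partition intervals on the defining line around the median endpoint p.

    Returns (p, below, contains, above): intervals entirely on one side of p,
    intervals containing p, and intervals entirely on the other side.
    """
    endpoints = [e for iv in intervals for e in iv]
    p = median_low(endpoints)      # p is always an actual endpoint
    below = [iv for iv in intervals if iv[1] < p]
    contains = [iv for iv in intervals if iv[0] <= p <= iv[1]]
    above = [iv for iv in intervals if iv[0] > p]
    return p, below, contains, above
```

Because p is itself an endpoint, at least one interval contains p, so both recursive subsets are strictly smaller than the input, and each holds at most half of the intervals.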

We call the nodes in the main 1-dimensional kd-interval
tree 1D-nodes. Such a node corresponds to an interval on
the defining line $\ell$. We call the nodes of the 2-dimensional
cs-priority-box-trees cs-nodes.
Figure 1: Querying a 1-dimensional kd-interval tree
with a box Q.


We start by analysing the query complexity when we query
with a segment on the line $\ell$.

Lemma 3.6. If we query a 1-dimensional kd-interval tree
storing a set S of n rectangles with a line segment on the
defining line $\ell$, then we visit at most $O(\log n + k)$ nodes,
where k is the number of rectangles to be reported.


Proof. Observe that the query segment s intersects a
rectangle (or bounding box) if and only if it intersects the
intersection of that rectangle (or bounding box) with $\ell$.

Consider a 1D-node that is visited when we query with s.
When the interval corresponding to this node is completely
contained in s, then by the above observation all rectangles
in the subtree intersect s. Hence, there cannot be more
than $O(k)$ such nodes. When the interval is not completely
contained in s, then it contains an endpoint of s, and there
are only $O(\log n)$ such nodes.

Now consider a cs-node v that is visited. Let p be the
point on $\ell$ common to all rectangles in the subtree of v.
Assume w.l.o.g. that $\ell$ is vertical and p lies inside or above s.
Then the rectangle in the subtree extending farthest
downward must intersect s. This rectangle is stored in a priority
leaf directly below v, so we can charge the visit of v to this
answer. □

Next  we  analyze  the  query  complexity  when  we  query  with 
a  box. 

Lemma 3.7. (i) If we query a 1-dimensional kd-interval
tree storing a set S of n rectangles with a query box Q, then
we visit at most $O(\sqrt{\sigma/w}\,\log n + k)$ nodes of weight at least
w, where k is the number of rectangles to be reported.

(ii) If $\sigma$ is $O(\log n / \log\log n)$, then the query time reduces
to $O(\log n + k)$.

(iii) If the projection of Q onto the line $\ell$ that stabs the
rectangles in S contains the intersections of all rectangles
with $\ell$, then the query time reduces to $O(k)$.


Proof. (i) See Figure 1. If Q intersects $\ell$ then the query
is equivalent to querying with $Q \cap \ell$, so the result follows
from the previous lemma. Otherwise, assume w.l.o.g. that
$\ell$ is vertical and that Q lies to the right of $\ell$. Consider a
1D-node v that is visited when we query with Q. When the
interval corresponding to this node is completely contained
in the projection of Q onto $\ell$, then the rectangle in the
subtree extending farthest to the right must be intersected. This
rectangle is stored in a priority leaf immediately below v, to
which we can charge the visit of v. Hence, there can be at
most k such nodes. When the interval is not completely
contained in the projection of Q, then it contains an endpoint of
the projection of Q, and there are only $O(\log n)$ such nodes.

Now consider a 2-dimensional cs-priority-box-tree that is
visited. Suppose the interval of the 1D-node that is the
parent of this subtree is completely contained in the projection
of Q onto $\ell$. Then we can argue again (using the priority
leaves) that we can charge all the visited nodes to rectangles
intersecting Q. If the interval of the 1D-node that is the parent
of this subtree is not completely contained in the projection
of Q, we argue as follows. First observe that the interval must
then contain an endpoint of the projection of Q, so there are
only $O(\log n)$ such parent nodes. In the 2-dimensional
configuration-space box-tree below such a parent, we apply
Lemma 3.4 to bound the number of visited nodes of weight at
least w by $O(\sqrt{n'/w} + k')$, where n' is the number of boxes
stored in the cs-priority-box-tree and k' is the number of
answers reported in this subtree. Note that $n' \le \sigma$, since the
cs-box-trees are used only to store sets of boxes that share a
single point. Hence, the overall number of cs-nodes visited is
$O(\sqrt{\sigma/w}\,\log n + k)$, finishing the proof of part (i) of the lemma.

(ii) For the proof of part (ii), we analyze the number of
cs-nodes visited in a different way. Note that cs-nodes in a
single cs-priority-box-tree share a single point on $\ell$. If this
point is contained in the projection of Q onto $\ell$, then we
can use the priority leaves to charge all nodes visited in this
cs-box-tree to rectangles intersecting Q.

If the defining point of a cs-priority-box-tree lies outside
the projection of Q onto $\ell$, then all cs-nodes visited in this
cs-box-tree must have a rectangle that contains an endpoint
of the projection of Q. In all cs-box-trees together, there
can be at most $O(\sigma \log \sigma)$ such nodes in total, since at most
$2\sigma$ leaf nodes can contain one of the two endpoints, and all
cs-box-trees have height $O(\log \sigma)$.

In total, we find a bound of $O(\log n + \sigma \log \sigma + k)$, which
reduces to $O(\log n + k)$ if $\sigma$ is $O(\log n / \log\log n)$.

(iii) If the projection of Q onto $\ell$ contains the intersections
of all rectangles with $\ell$, it also contains all intervals
corresponding to the nodes in the box-tree. Therefore, we can
use the priority leaves again to charge all the visited nodes
to rectangles intersecting Q. □


The 2-dimensional case. Our kd-interval tree for a general
set S of rectangles in the plane is defined as follows.

If |S| = 1, then T consists of a single leaf node storing the
input box in S. Otherwise we make a node v storing the
bounding box $B_v$ of all boxes in S, and proceed as follows.

For each of the 4 inner normals of the edges of $B_v$, take
the rectangle from S that extends farthest in the direction of
that normal. This results in a set S* of at most 4 rectangles.
Each rectangle in S* is put in a priority leaf, which is an
immediate child of v.

If the set S \ S* of remaining rectangles contains fewer than
two rectangles, then this rectangle (if it exists) is put as a
leaf child of v. If two or more rectangles remain, we split
the cell corresponding to v using a vertical or horizontal
line (depending on the level of v in the tree). This splitting
line $\ell$ is chosen such that the number of rectangles in S \ S*
lying completely to either side of $\ell$ is at most $\lfloor |S \setminus S^*|/2 \rfloor$.
The rectangles in S \ S* lying to one side of $\ell$ are stored
recursively in one subtree of v. The rectangles in S \ S* lying
to the other side of $\ell$ are stored recursively in another subtree
of v. The rectangles in S \ S* intersecting $\ell$ are stored in a
1-dimensional kd-interval tree, as explained above. We call the
nodes of the main tree, which correspond to 2-dimensional
cells, 2D-nodes. Next we analyze the performance of the
kd-interval tree.
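
One way to find a splitting line with the required balance is sketched below. This is an assumption made for illustration, not necessarily the choice intended in the text: the line is placed at the (floor(m/2) + 1)-th smallest upper coordinate along the splitting axis, rectangles are treated as closed, and the name `choose_split` is hypothetical.

```python
def choose_split(rects, axis):
    """Pick a splitting coordinate c along `axis` for rectangles (lo, hi).

    At most floor(m/2) rectangles lie strictly on either side of the line
    x_axis = c; every other rectangle intersects the line.
    """
    m = len(rects)
    # (floor(m/2) + 1)-th smallest upper coordinate along the axis.
    c = sorted(r[1][axis] for r in rects)[m // 2]
    left = [r for r in rects if r[1][axis] < c]
    on = [r for r in rects if r[0][axis] <= c <= r[1][axis]]
    right = [r for r in rects if r[0][axis] > c]
    return c, left, on, right
```

At most floor(m/2) upper coordinates are smaller than c, which bounds the left side. Any rectangle strictly to the right also has its upper coordinate above c, and there are at most ceil(m/2) - 1 of those, which bounds the right side.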

Lemma 3.8. The number of nodes of weight at least w
that are visited by a range query with an axis-aligned box
is $O(\sqrt{n/w}\,\log n + \sqrt{\sigma/w}\,\log^2 n + k)$, where k is the number
of reported answers. The number of such nodes visited by a
point query is $O(\sqrt{\sigma/w}\,\log^2 n + k)$. If $\sigma$ is $O(\log n / \log\log n)$,
we may omit the $\sqrt{\sigma/w}$ factor.

Proof.  Consider  a  2D-node  that  is  visited  when  we  query 
with  an  axis-aligned  rectangle  Q.  We  distinguish  four  differ¬ 
ent  types  of  such  nodes  (see  Figure  2  (i)).  We  bound  their 
number  and  the  number  of  nodes  visited  in  1-dimensional 
kd-interval-subtrees  for  each  type  separately. 

Inner nodes: These are 2D-nodes whose bounding boxes
lie completely inside Q. The number of inner nodes is easy
to bound, since all rectangles in the subtree of such a node
intersect Q. Hence, the total number of such nodes, or nodes
in their 1-dimensional associated kd-interval trees, is $O(k)$.

Side nodes: These are 2D-nodes whose bounding boxes cut
exactly one edge of Q. In this case the rectangle that extends
farthest into the direction of the inner normal of this edge
must intersect Q. This rectangle is stored in a priority leaf
immediately below the node. The same reasoning applies
to their 1-dimensional associated kd-interval trees. Hence,
the total number of side nodes or nodes in their associated
kd-interval trees is $O(k)$.

Piercing nodes: These are 2D-nodes that cut two opposing
edges of Q, but do not contain any corners of Q. From
Lemma 3.1 and the fact that all nodes at the lowermost
$\lfloor \log(w/(2d)) \rfloor$ levels of the tree must have weight less than
w, we conclude that the number of 2D-nodes with weight
at least w that intersect any edge of Q must be bounded by
$\sum_{i=0}^{\lceil \log n \rceil - \lfloor \log(w/(2d)) \rfloor} O(2^{i/2}) = O(\sqrt{n/w})$. Now there are
two cases (see Figure 2(ii)): the splitting line used at such a
node v is orthogonal to the intersected edges, or it is parallel
to them. In the former case we can apply Lemma 3.6 to
obtain an $O(\log n + k')$ bound on the number of nodes visited in
the 1-dimensional kd-interval tree associated with v, where
k' is the number of reported answers. In the latter case we
can apply Lemma 3.7(iii) to get a bound of $O(k')$. Hence,
we get a grand total of $O(\sqrt{n/w}\,\log n + k)$.

Corner nodes: These are 2D-nodes that contain one or more
corners of Q. There are $O(\log n)$ such nodes. To obtain the
total number of visited nodes in the associated 1-dimensional
kd-interval trees, we have to multiply this by the bound of
Lemma 3.7, leading to a total of $O(\sqrt{\sigma/w}\,\log^2 n + k)$ in the
general case, or $O(\log^2 n + k)$ if $\sigma$ is $O(\log n / \log\log n)$.

There are no other types of nodes whose bounding boxes
intersect Q. Adding up the number of nodes for all four
cases gives the desired bound for box-queries. Note that in
the case of point queries, we only have corner nodes. For
the building time, see Section 3.4. □

This  leads  to  the  following  theorem. 


Figure 2: (i) Four different types of 2D-nodes (inner, side, piercing, corner) with respect to a query range Q. (ii) Piercing nodes with
parallel splitting lines (to the left) and orthogonal splitting lines (to the right).


Theorem 3.9. Let S be a set of n possibly intersecting
boxes in the plane, such that no single point is contained
in more than $\sigma$ boxes. There is a box-tree for S such that
the number of nodes visited by a range query with an
axis-aligned box is $O(\sqrt{n}\,\log n + \sqrt{\sigma}\,\log^2 n + k)$, where k is the
number of boxes in S intersecting the query range. The
number of nodes visited by a point query is $O(\sqrt{\sigma}\,\log^2 n + k)$.
If $\sigma$ is $O(\log n / \log\log n)$, this reduces to $O(\log^2 n)$. The
box-tree can be built in $O(n \log n)$ time.

3.3  The  longest-side-first  approach 

Recall that a kd-interval tree is basically a modified
kd-tree, where each node is split by a line. The orientations of
these lines depend on the level in the tree in such a way that
the orientations take turns in a round-robin fashion on any path
from the root down into the tree. An interesting variation
of the kd-interval tree arises when we replace the
round-robin splitting strategy by the longest-side splitting rule as
suggested by Dickerson et al. [8]. In such a longest-side-first
kd-interval tree, the number of nodes whose corresponding
cell is pierced by a query rectangle is small if the query
rectangle is fat. We use this to prove the following lemma.

Lemma 3.10. The number of nodes of weight at least w
that are visited by a range query with an axis-aligned box
of aspect ratio $\alpha$ is $O((\alpha + \sqrt{\sigma/w})\log^2 n + k)$, where k is the
number of reported answers. The number of such nodes visited by a point
query is $O(\sqrt{\sigma/w}\,\log^2 n + k)$. If $\sigma$ is $O(\log n / \log\log n)$, the
$O(\sqrt{\sigma/w})$ factor can be omitted from the bounds.

Proof. In the analysis in the previous subsection, the
piercing nodes were responsible for the $O(\sqrt{n/w}\,\log n)$ term
in the query complexity. This term arose because in a
normal kd-tree there can be $O(\sqrt{n/w})$ piercing nodes, and
in each of the associated 1-dimensional kd-interval trees,
$O(\log n)$ nodes could be visited.

In the longest-side-first kd-tree, however, the number of
disjoint cells that cut opposing sides of a query rectangle
of aspect ratio $\alpha$ is $O(\alpha \log n)$ [8]. As before, we have two
types of piercing nodes: those with splitting lines that are
orthogonal to the intersected edges of Q, and those with
parallel splitting lines. For the first case, observe that such
splitting lines separate two disjoint cells that cut opposing
sides of the query rectangle. This implies that there can be
at most $O(\alpha \log n)$ piercing nodes with orthogonal splitting
lines, each of which can have a 1-dimensional kd-interval
tree in which $O(\log n + k')$ nodes are visited. For the second
case, observe that the total number of piercing nodes on all
levels in the tree is at most $O(\alpha \log^2 n)$, and each of them can
have a 1-dimensional kd-interval tree in which $O(k')$ nodes
are visited. Hence, we get a grand total of $O(\alpha \log^2 n + k)$
for both types of piercing nodes.

Since  the  other  cases  in  the  analysis  of  the  original  kd-tree 
still  go  through,  the  lemma  follows.  □ 

Theorem 3.11. Let S be a set of n boxes in the plane
with stabbing number $\sigma$. There is a box-tree for S such that
the number of nodes that are visited by a range query with a
rectangular range of aspect ratio $\alpha$ is $O((\alpha + \sqrt{\sigma})\log^2 n + k)$,
where k is the number of boxes in S intersecting the query
range. The number of such nodes visited by a point query
is $O(\sqrt{\sigma}\,\log^2 n + k)$. If $\sigma$ is $O(\log n / \log\log n)$, the $O(\sqrt{\sigma})$
factor can be omitted from the bounds. The box-tree can
be built in $O(n \log n)$ time.

3.4  Building  the  box-trees 

All box-trees mentioned in this section can be built in
$O(n \log n)$ time. Since the construction algorithms are very
similar, we will explain them together.

We start by sorting all input boxes by $x_i^-$-coordinate and
$x_i^+$-coordinate for all dimensions $1 \le i \le d$. This costs
$O(n \log n)$ time. Using suitable list structures and
cross-pointers, we can now do the following operations:

• in $O(1)$ time, selecting a box with an extreme value
for one of the 2d coordinates and removing it from the
2d sorted lists;

• in $O(1)$ time, determining the bounding box of the set
(and, if necessary, determining the dimension in which
the bounding box is largest);

• in $O(n)$ time, splitting the set of boxes in two, such
that all boxes whose value for a particular coordinate
is smaller than the median for that coordinate go in
one list, while the remaining boxes go in the other list,
and at the same time splitting the 2d sorted lists in
sorted lists for each of the two subsets;

• in $O(n)$ time, splitting the set of boxes in three subsets
$S^-$, $S^0$ and $S^+$ with respect to some discriminating
dimension i, such that there is a value $x_i^0$ such that all
boxes in $S^-$ are on one side of the hyperplane $x_i = x_i^0$,
all boxes in $S^+$ are on the other side, all boxes in
$S^0$ intersect the hyperplane, $|S^-| \le n/2$ and $|S^+| \le n/2$,
and at the same time splitting the 2d sorted lists in
sorted lists for each of the three subsets.

With these operations we can do the construction of our
box-trees in time $O(n \log n)$. The details are omitted from
this abstract.
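
The linear-time splitting of the presorted lists can be illustrated with a short sketch. It is only an illustration under assumptions: boxes are identified by integer ids, each of the 2d sorted lists is a plain Python list of ids, `goes_left` is a hypothetical lookup telling to which subset each box belongs, and the cross-pointers mentioned above are not modelled.

```python
def split_sorted_lists(sorted_lists, goes_left):
    """Split each presorted id list into two lists that remain sorted.

    sorted_lists: one list of box ids per coordinate, each in sorted order.
    goes_left:    mapping from box id to True (first subset) or False.
    A single pass per list keeps the relative order, so no re-sorting is
    needed; the total time is linear in the combined length of the lists.
    """
    left = [[b for b in ids if goes_left[b]] for ids in sorted_lists]
    right = [[b for b in ids if not goes_left[b]] for ids in sorted_lists]
    return left, right
```

For example, with five intervals whose lower endpoints are `[3, 1, 4, 0, 2]` and upper endpoints `[5, 9, 6, 7, 8]`, the id lists sorted by lower and by upper endpoint are `[3, 1, 4, 0, 2]` and `[0, 2, 3, 4, 1]`. Sending the boxes with lower endpoint below 2 to the first subset yields the lists `[3, 1]` and `[3, 1]` for that subset, both still in sorted order.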

4. FROM BOX-TREES TO R-TREES

In  the  previous  section  we  described  several  algorithms  to 
construct  box-trees  with  good  query  complexity.  In  this  sec¬ 
tion  we  give  general  theorems  to  convert  them  to  (semi-)R- 
trees. 

We start with a general theorem that converts any
box-tree to an R-tree. Recall that the weight of a box-tree node
is the number of input boxes stored in its subtree.

Theorem 4.1. Let T be a box-tree for a set of n boxes
in $\mathbb{R}^d$ such that any query with a range of a given type visits
at most f(w) nodes of weight w or more. Then T can be
converted in $O(n)$ time to an R-tree of minimum degree t
where every query with a range of the same type visits at
most $O(f(t) \log n / \log t)$ nodes.

Proof. We simply read out the leaves from T in order,
and then construct an R-tree where the boxes occur in the
same order in the leaves. We can build this R-tree
bottom-up, level by level. First we construct the R-tree nodes just
above leaf level by repeatedly taking 2t leaves from the list
and giving them a new R-tree node as their parent. We
continue doing this until fewer than 4t leaves are without parent:
these leaves are then divided into two groups (if there are
more than 2t) or made children of a single parent (if there
are no more than 2t leaves left). Next, we consider the new
parent nodes just constructed as leaves, and construct the
next level of the tree, and so on, until we reach the level
where only one node is constructed (the root). In this way,
we spend $O(1)$ time for each node to connect it to a parent
node, thus getting a total running time of $O(n)$.

Consider a bounding box B stored in the R-tree. It is
the bounding box for some input boxes that were stored in
consecutive leaves in the box-tree T. Let v(B) be the lowest
common ancestor of these leaves. Since the minimum degree
in the R-tree is t, the weight of v(B) is t or more.
Furthermore, the nodes v(B) for the bounding boxes B stored at
a fixed level in the R-tree must be distinct, because their
defining sets form a partition of the leaves in T into
consecutive sequences. Hence, we can charge the visited nodes of
the R-tree to visited nodes of weight t or more in T, in such
a way that a node in T does not get charged more than once
from nodes at a fixed level in the R-tree. Since the depth of
the R-tree is $O(\log n / \log t)$, the bound follows. □
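
The bottom-up grouping used in the proof of Theorem 4.1 can be sketched as follows. This is a minimal illustration under assumptions: the tree is represented as nested Python lists without bounding boxes, t is at least 2, and the function names `group_level`, `pack_rtree` and `flatten` are hypothetical.

```python
def group_level(items, t):
    """Group consecutive items into nodes of degree between t and 2t."""
    groups, i = [], 0
    while len(items) - i >= 4 * t:        # take 2t items at a time
        groups.append(items[i:i + 2 * t])
        i += 2 * t
    rest = items[i:]                      # fewer than 4t items remain
    if len(rest) > 2 * t:                 # split into two groups of size >= t
        half = len(rest) // 2
        groups += [rest[:half], rest[half:]]
    elif rest:                            # at most 2t items: a single parent
        groups.append(rest)
    return groups

def pack_rtree(leaves, t):
    """Build the tree level by level; returns the root as a nested list."""
    level = list(leaves)
    while True:
        level = group_level(level, t)
        if len(level) == 1:
            return level[0]

def flatten(node):
    """List the leaves below a node in left-to-right order."""
    if not isinstance(node, list):
        return [node]
    return [leaf for child in node for leaf in flatten(child)]
```

Every level is produced by one pass over the level below it, and the leaves keep the order in which they were read from the box-tree, which is what the charging argument relies on.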

The  construction  of  Theorem  4.1  results  in  losing  a  logarith¬ 
mic  factor  in  the  query  complexity.  Next  we  show  how  to 
improve  this  result  for  perfectly  balanced  box-trees.  Recall 
that  a  box-tree  is  called  perfectly  balanced  if  for  any  node 
the  weight  of  its  left  and  right  child  differ  by  at  most  one. 

Theorem 4.2. Let T be a perfectly balanced box-tree for
a set of n boxes in $\mathbb{R}^d$ such that any query with a range of
a given type visits at most f(i) nodes at level i in T. Then
T can be converted in $O(n)$ time to an R-tree of minimum
degree t where every query with a range of the given type
visits at most $\sum_i f(i \log t)$ nodes.


Proof. We first prove that any perfectly balanced tree
has the following property: the weights of all nodes at a
fixed level in the tree differ by at most one. The proof is
by induction on the level. The statement is trivially true at
level zero (the level of the root). Now assume all nodes
at a given level have weight w or w + 1. Then the balancing
condition guarantees that the nodes at the next level have
weight w/2 or w/2 + 1 (in case w is even) or they have weight
(w + 1)/2 - 1 or (w + 1)/2 (in case w is odd). So in both
cases the weights at the next level differ by at most one.
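
The induction can be checked numerically with a small sketch. It models an assumption-laden simplification: a plain binary tree in which every node of weight w > 1 has children of weight floor(w/2) and ceil(w/2), and nodes of weight 1 are leaves; the function name `level_weights` is hypothetical.

```python
def level_weights(n):
    """Return the list of node weights on each level of a perfectly
    balanced binary tree storing n items (level 0 is the root)."""
    levels = [[n]]
    while max(levels[-1]) > 1:
        nxt = []
        for w in levels[-1]:
            if w > 1:                     # weight-1 nodes are leaves
                nxt += [w // 2, w - w // 2]
        levels.append(nxt)
    return levels
```

For instance, `level_weights(11)` gives the levels `[11]`, `[5, 6]`, `[2, 3, 3, 3]`, `[1, 1, 1, 2, 1, 2, 1, 2]` and `[1, 1, 1, 1, 1, 1]`, and on each level the weights differ by at most one.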

We can now construct an R-tree from T as follows. From
the leaf level, walk up the tree until a level i is encountered
where all nodes have weight at least t. Thus there must be
at least one node with weight at most t - 1 on the level just
below i, and therefore, by the perfect balance property, no
node on that level has weight more than t. This implies
that the weight of nodes at level i cannot exceed 2t. Hence,
the subtree of each node at this level can be compressed into a
single leaf (which will be a node in the R-tree). Recurse on
the new tree. The recursion ends when there are fewer than
t leaves, which are compressed to a single node which will
form the root of the R-tree.

The bound on the query complexity immediately follows
from the construction. The details of a construction in $O(n)$
time are omitted from this abstract. □

Finally, we can show that we can also improve
Theorem 4.1 for the general case if we are willing to settle for
semi-R-trees instead of real R-trees. (Recall that the
difference between a semi-R-tree and an R-tree is that in the
former we do not require all leaves to be at the same depth.)

Theorem 4.3. Let T be a box-tree for a set of n boxes
in $\mathbb{R}^d$ such that any query with a range of a given type visits
at most f(w) nodes of weight w or more. Then T can be
converted in $O(n)$ time to a semi-R-tree of minimum degree
t where every query with a range of the same type visits at
most $O(f(t))$ nodes.

The proof is omitted from this abstract. The main point
is that during the conversion from box-tree to semi-R-tree,
we do not introduce new bounding boxes, no bounding box
in the box-tree appears more than once in the semi-R-tree,
and no internal nodes with weight less than t are put in the
semi-R-tree. By applying the conversion algorithms of the
theorems above to the structures from the previous section,
we obtain the following results.

Corollary 4.4. Let S be a set of n boxes in $\mathbb{R}^d$ with
stabbing number $\sigma$.

(i) There is an R-tree for S of minimum degree t such
that the number of nodes visited by any box query is
$O((n/t)^{1-1/d} + k \log n / \log t)$, where k is the number
of reported answers.

(ii) There is a semi-R-tree for S of minimum degree t such
that the number of nodes visited by any box query is
$O((n/t)^{1-1/d} + k)$.

(iii) When d = 2, there is a semi-R-tree for S of minimum
degree t such that the number of nodes visited
by any box query is $O(\sqrt{n/t}\,\log n + \sqrt{\sigma/t}\,\log^2 n + k)$,
and the number of nodes visited by any point query
is $O(\sqrt{\sigma/t}\,\log^2 n + k)$. In both bounds, k is the number
of reported answers. If $\sigma$ is $O(\log n / \log\log n)$, the
$O(\sqrt{\sigma/t})$ factor can be omitted from the bounds.


(iv) When d = 2, there is a semi-R-tree for S of minimum
degree t such that the number of nodes visited by any
query with a rectangle of aspect ratio $\alpha$ is
$O((\alpha + \sqrt{\sigma/t})\,\log^2 n + k)$, where k is the number of reported
answers. If $\sigma$ is $O(\log n / \log\log n)$, the bound reduces
to $O(\alpha \log^2 n + k)$.

(v) For the cases mentioned under (iii) and (iv) there is
also an R-tree of minimum degree t for which the number
of visited nodes is $O(\log n / \log t)$ times the number
of visited nodes in the semi-R-tree.

All R-trees can be constructed in $O(n \log n)$ time.
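
As a rough numerical illustration of parts (i) and (ii), the sketch below evaluates the two expressions inside the O-notation with every hidden constant set to 1. That choice is an assumption made purely for illustration; the corollary says nothing about constants, so the values indicate growth rates only. The function names are hypothetical.

```python
import math


def semi_rtree_term(n: int, t: int, d: int, k: int) -> float:
    """(n/t)^(1-1/d) + k, the expression in part (ii), constants taken as 1."""
    return (n / t) ** (1 - 1 / d) + k


def rtree_term(n: int, t: int, d: int, k: int) -> float:
    """(n/t)^(1-1/d) + k*log(n)/log(t), the expression in part (i), constants taken as 1."""
    return (n / t) ** (1 - 1 / d) + k * math.log(n) / math.log(t)


if __name__ == "__main__":
    n, t, d, k = 10**6, 100, 2, 50
    print(semi_rtree_term(n, t, d, k))
    print(rtree_term(n, t, d, k))
```

With these inputs the output-sensitive term k is multiplied by log n / log t = 3 in the R-tree expression, which is the price for requiring all leaves to be at the same depth.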

5.  CONCLUSIONS 

We have developed new algorithms to construct box-trees
(bounding-volume  hierarchies  using  axis-aligned  boxes  as 
bounding  volumes)  and  we  analyzed  the  complexity  of  rect¬ 
angle-intersection  queries  and  point-containment  queries  for 
these  structures.  We  also  proved  lower  bounds  showing  that 
our  results  are  optimal  or  almost  optimal.  Finally,  we  gave 
algorithms  to  convert  our  box-trees  to  (semi-) R-trees  with 
optimal  or  almost  optimal  query  complexity. 

The  bounds  that  we  get,  except  for  the  case  of  fat  ranges 
in  the  plane,  are  rather  disappointing — even  though  they  are 
optimal.  In  practice,  one  would  hope  for  much  better  perfor¬ 
mance.  It  would  be  interesting  to  see  under  which  conditions 
one can obtain better bounds for, say, box queries in $\mathbb{R}^3$.
We  also  would  like  to  see  how  our  trees  behave  in  practice — 
the  lower-bound  constructions  are  rather  contrived — and  to 
compare  them  experimentally  against  trees  constructed  by 
known  heuristics. 

In  many  applications  it  is  important  to  support  fast  inser¬ 
tions  and  deletions,  and  it  would  be  interesting  to  develop 
box-trees or R-trees that support fast insertion and deletion,
while  still  guaranteeing  close  to  optimal  query  complexity. 